AITopics

Country: Europe > United Kingdom > England (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.95)
Questionnaire & Opinion Survey (0.69)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Neural Information Processing SystemsApr-30-2026, 08:25:43 GMT

or Sound Symbolism in Vision and Language Models

Although the mapping between sound and meaning in human language is assumed to be largely arbitrary, research in cognitive science has shown that there are non-trivial correlations between particular sounds and meanings across languages and demographic groups, a phenomenon known as sound symbolism. Among the many dimensions of meaning, sound symbolism is particularly salient and welldemonstrated with regards to cross-modal associations between language and the visual domain. In this work, we address the question of whether sound symbolism is reflected in vision-and-language models such as CLIP and Stable Diffusion. Using zero-shot knowledge probing to investigate the inherent knowledge of these models, we find strong evidence that they do show this pattern, paralleling the well-known kiki-bouba effect in psycholinguistics. Our work provides a novel method for demonstrating sound symbolism and understanding its nature using computational tools. Our code will be made publicly available1.

artificial intelligence, machine learning, natural language, (18 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Neural Information Processing SystemsFeb-18-2026, 00:41:01 GMT

Morris Alper and Hadar A verbuch-Elor Tel Aviv University Contents

Unless stated otherwise we use American English for transcriptions.

artificial intelligence, machine learning, natural language, (17 more...)

Country:

Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.40)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.95)
Questionnaire & Opinion Survey (0.69)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)

Neural Information Processing SystemsFeb-18-2026, 00:40:58 GMT

or Sound Symbolism in Vision and Language Models

That which we call a rose by any other name would smell as sweet."

artificial intelligence, machine learning, natural language, (18 more...)

Country:

North America > United States (0.14)
Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
Africa > Namibia (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Neural Information Processing SystemsFeb-8-2026, 22:38:38 GMT

5f1d3986fae10ed2994d14ecd89892d7-Supplemental.pdf

dataset, github repository, indicator, (16 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.68)
Information Technology > Artificial Intelligence > Natural Language (0.48)

arXiv.org Artificial IntelligenceDec-4-2025

SkillFactory: Self-Distillation For Learning Cognitive Behaviors

Sprague, Zayne, Lu, Jack, Wadhwa, Manya, Keh, Sedrick, Ren, Mengye, Durrett, Greg

Reasoning models leveraging long chains of thought employ various cognitive skills, such as verification of their answers, backtracking, retrying by an alternate method, and more. Previous work has shown that when a base language model exhibits these skills, training that model further with reinforcement learning (RL) can learn to leverage them. How can we get models to leverage skills that aren't exhibited by base models? Our work, SkillFactory, is a method for fine-tuning models to roughly learn these skills during a supervised fine-tuning (SFT) stage prior to RL. Our approach does not rely on distillation from a stronger model, but instead uses samples from the model itself, rearranged to provide training data in the format of those skills. These "silver" SFT traces may be imperfect, but are nevertheless effective for priming a model to acquire skills during RL. Our evaluation shows that (1) starting from SkillFactory SFT initialization helps a model to generalize to harder variants of a task post-RL, despite lower performance pre-RL; (2) cognitive skills are indeed used by the model; (3) RLed SkillFactory models are more robust to regression on out-of-domain tasks than RLed base models. Our work suggests that inductive biases learned prior to RL help models learn robust cognitive skill use.

large language model, machine learning, natural language, (21 more...)

2512.04072

Country: North America > United States > Minnesota (0.28)

Genre: Research Report > New Finding (0.67)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
(2 more...)

arXiv.org Artificial IntelligenceNov-26-2025

Breaking Bad: Norms for Valence, Arousal, and Dominance for over 10k English Multiword Expressions

Mohammad, Saif M.

Factor analysis studies have shown that the primary dimensions of word meaning are Valence (V), Arousal (A), and Dominance (D). Existing lexicons such as the NRC VAD Lexicon, published in 2018, include VAD association ratings for words. Here, we present a complement to it, which has human ratings of valence, arousal, and dominance for 10k English Multiword Expressions (MWEs) and their constituent words. We also increase the coverage of unigrams, especially words that have become more common since 2018. In all, the new NRC VAD Lexicon v2 now has entries for 10k MWEs and 25k words, in addition to the entries in v1. We show that the associations are highly reliable. We use the lexicon to examine emotional characteristics of MWEs, including: 1. The degree to which MWEs (idioms, noun compounds, and verb particle constructions) exhibit strong emotionality; 2. The degree of emotional compositionality in MWEs. The lexicon enables a wide variety of research in NLP, Psychology, Public Health, Digital Humanities, and Social Sciences. The NRC VAD Lexicon v2 is freely available through the project webpage: http://saifmohammad.com/WebPages/nrc-vad.html

artificial intelligence, lexicon, natural language, (20 more...)

2511.19816

Country:

North America > United States (1.00)
Europe (0.93)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.46)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Communications > Social Media > Crowdsourcing (0.93)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (0.68)

arXiv.org Artificial IntelligenceNov-19-2025

Can QE-informed (Re)Translation lead to Error Correction?

Padmanabhan, Govardhan

The paper presents two approaches submitted to the WMT 2025 Automated Translation Quality Evaluation Systems Task 3 - Quality Estimation (QE)-informed Segment-level Error Correction. While jointly training QE systems with Automatic Post-Editing (APE) has shown improved performance for both tasks, APE systems are still known to overcorrect the output of Machine Translation (MT), leading to a degradation in performance. We investigate a simple training-free approach - QE-informed Retranslation, and compare it with another within the same training-free paradigm. Our winning approach selects the highest-quality translation from multiple candidates generated by different LLMs. The second approach, more akin to APE, instructs an LLM to replace error substrings as specified in the provided QE explanation(s). A conditional heuristic was employed to minimise the number of edits, with the aim of maximising the Gain-to-Edit ratio. The two proposed approaches achieved a Delta COMET score of 0.0201 and -0.0108, respectively, leading the first approach to achieve the winning position on the subtask leaderboard.

large language model, machine learning, natural language, (17 more...)

2511.13884

Country:

North America > United States (0.29)
Europe > United Kingdom (0.28)
Asia (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Morita, Takashi, O'Donnell, Timothy J.

Unsupervised Classification of English Words Based on Phonological Information: Discovery of Germanic and Latinate Clusters

arXiv.org Artificial IntelligenceOct-28-2025

Cross-linguistically, native words and loanwords follow different phonological rules. In English, for example, words of Germanic and Latinate origin exhibit different stress patterns, and a certain syntactic structure, double-object datives, is predominantly associated with Germanic verbs rather than Latinate verbs. As a cognitive model, however, such etymology-based generalizations face challenges in terms of learnability, since the historical origins of words are presumably inaccessible information for general language learners. In this study, we present computational evidence indicating that the Germanic-Latinate distinction in the English lexicon is learnable from the phonotactic information of individual words. Specifically, we performed an unsupervised clustering on corpus-extracted words, and the resulting word clusters largely aligned with the etymological distinction. The model-discovered clusters also recovered various linguistic generalizations documented in the previous literature regarding the corresponding etymological classes. Moreover, our findings also uncovered previously unrecognized features of the quasi-etymological clusters.

artificial intelligence, machine learning, natural language, (19 more...)

2504.1177

Country:

Europe > United Kingdom > England (0.68)
North America > United States > California (0.67)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Genre: Research Report > New Finding (0.86)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Ding, Cui, Yin, Yanning, Jäger, Lena A., Wilcox, Ethan Gotlieb

Modeling Bottom-up Information Quality during Language Processing

arXiv.org Artificial IntelligenceOct-28-2025

Contemporary theories model language processing as integrating both top-down expectations and bottom-up inputs. One major prediction of such models is that the quality of the bottom-up inputs modulates ease of processing -- noisy inputs should lead to difficult and effortful comprehension. We test this prediction in the domain of reading. First, we propose an information-theoretic operationalization for the "quality" of bottom-up information as the mutual information (MI) between visual information and word identity. We formalize this prediction in a mathematical model of reading as a Bayesian update. Second, we test our operationalization by comparing participants' reading times in conditions where words' information quality has been reduced, either by occluding their top or bottom half, with full words. We collect data in English and Chinese. We then use multimodal language models to estimate the mutual information between visual inputs and words. We use these data to estimate the specific effect of reduced information quality on reading times. Finally, we compare how information is distributed across visual forms. In English and Chinese, the upper half contains more information about word identity than the lower half. However, the asymmetry is more pronounced in English, a pattern which is reflected in the reading times.

information, machine learning, natural language, (19 more...)

2509.17047

Country:

Europe (0.68)
North America > United States (0.68)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)
(2 more...)